(57) Abstract: The disclosure describes a method
of embedding a digital watermark [110] into a
video signal using a time-based perceptual mask
[106] such that the digital watermark [110] is
substantially imperceptible in the video signal 
[112]. A digital watermark embedder computes 
a time-based perceptual mask [106] comprising 
gain values corresponding to locations within a 
frame. The gain value for a location in the frame 
is changed as a function of the change in one 
or more pixel values over time. The embedder 
uses the gain values of the time-based perceptual 
mask to control embedding of corresponding 
elements of a digital watermark signal such that 
the perceptibility of the elements of the digital 
watermark signal is reduced in time varying 
locations of the video signal. 

TIME AND OBJECT BASED MASKING FOR VIDEO WATERMARKING

Technical Field 

The invention relates to steganography, digital watermarking, and data hiding.



Background and Summary 
Digital watermarking is a process for modifying physical or electronic media to
embed a machine-readable code into the media. The media may be modified such that
the embedded code is imperceptible or nearly imperceptible to the user, yet may be
detected through an automated detection process. Most commonly, digital
watermarking is applied to media signals such as images, audio signals, and video
signals. However, it may also be applied to other types of media objects, including
documents (e.g., through line, word or character shifting), software, multi-dimensional
graphics models, and surface textures of objects.

Digital watermarking systems typically have two primary components: an
encoder that embeds the watermark in a host media signal, and a decoder that detects
and reads the embedded watermark from a signal suspected of containing a watermark
(a suspect signal). The encoder embeds a watermark by altering the host media signal.
The reading component analyzes a suspect signal to detect whether a watermark is
present. In applications where the watermark encodes information, the reader extracts
this information from the detected watermark.

Several particular watermarking techniques have been developed. The reader is
presumed to be familiar with the literature in this field. Particular techniques for
embedding and detecting imperceptible watermarks in media signals are detailed in the
assignee's co-pending application serial number 09/503,881 and US Patent 6,122,403,
which are hereby incorporated by reference. Examples of other watermarking
techniques are described in US Patent Application 09/404,292, which is hereby
incorporated by reference. Additional features of watermarks relating to authentication
of media signals and fragile watermarks are described in US Patent Applications
60/198,138, 09/498,223, 09/433,104, and 60/232,163, which are hereby incorporated by
reference.

The problem with video watermarking is that many static image based
watermark systems or static watermarking systems have been adapted to video, where
"static" refers to processes that do not account for changes of multimedia content over
time. However, video is dynamic with respect to time. For example, a mostly invisible
image watermark may be visible in video because as the image changes and the
watermark remains the same, the watermark can be visibly perceived. In other words,
the problem is that the watermark may be mostly invisible in each frame, but the
motion of an object through the stationary watermark makes the watermark visible in
video. Similarly, an invisible watermark in a video may be visible in each frame, just
as artifacts due to lossy compression are imperceptible in video, yet visible if individual
frames of the video are examined as still images. It is believed that our eyes and brain
average these effects over time to remove the distortion.

The invention provides a method of embedding a digital watermark into a video
signal using a time-based perceptual mask such that the digital watermark is
substantially imperceptible in the video signal. In other words, the watermark is
reduced in value where it can be perceived due to the dynamics of video as described
above. A digital watermark embedder computes a time-based perceptual mask
comprising gain values corresponding to locations within a frame. The gain value for a
location in the frame is changed as a function of the change in one or more pixel values
at the location over time. The embedder uses the gain values of the time-based
perceptual mask to control embedding of corresponding elements of a digital watermark
signal such that the perceptibility of the elements of the digital watermark signal is
reduced in time varying locations of the video signal. This inter-frame time-based gain
coefficient can be combined with intra-frame spatial-based gain coefficients that make
watermarks mostly invisible in each frame based upon static-image perception, or less
visible in each static frame and completely invisible in video based upon spatial video
perceptual theory or experimentation.

An alternative method is to segment objects and have the watermarks move with
each object, labeled object-based masking. The segmentation must be accurate to
alleviate edge effects. This method may be very applicable with MPEG-4 where the
video is stored as individual objects.

Further features of the invention will become apparent from the following
detailed description and accompanying drawing. 

Brief Description of the Drawing 
Fig. 1 illustrates a diagram of a digital watermark embedder for video using
time based perceptual masking to reduce visibility of the watermark.

Detailed Description 
Time-based Masking of Video Watermarks 

An improvement is to change the gain of the watermark depending upon the
dynamic attributes of the local area around the watermark. Specifically, if the pixel
represents a changing or moving area, the watermark is reduced in value, unless the
movement is chaotic or noise-like, in which case the gain can remain large.

More specifically, given the current value for one pixel, if that current value is
similar to the values before and after the current frame (for the same pixel), the
watermark gain, labeled time-gain, for that pixel should be near 1. The time-gain
should drop as the values of that pixel change in time, as long as the change is steady
over time. The more the steady change, the smaller the time-gain, where change can be
measured as absolute difference or statistical variance. This should be repeated for
each pixel or group of pixels in the frame. However, if the change in the pixel or group
of pixels is chaotic or noise-like, the time-gain can remain near 1 since noisy
environments are a good place to hide watermarks. In addition, we may want to look
only at the frame before and after or two or more frames in each time-direction. To this
end, if the pixel represents a changing or moving area, the watermark is reduced in
value.
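
The rule above can be sketched for a single pixel using the frame before and the
frame after. This is a minimal illustration only: the threshold, the falloff constant,
the reciprocal falloff curve, and the use of a sign reversal as the test for
"noise-like" change are all assumptions made for the example, not values or tests
given in this disclosure.

```python
def time_gain(prev, cur, nxt, still_thresh=2.0, falloff=0.05):
    """Time-gain in (0, 1] for one pixel from three consecutive frames.

    still_thresh and falloff are hypothetical tuning constants.
    """
    d_back = cur - prev          # change from the previous frame
    d_fwd = nxt - cur            # change to the next frame
    change = (abs(d_back) + abs(d_fwd)) / 2.0
    if change < still_thresh:
        return 1.0               # stationary pixel: leave the gain near 1
    if d_back * d_fwd < 0:
        return 1.0               # direction reverses: treated here as noise-like
    # Steady change: the larger the change, the smaller the gain.
    return 1.0 / (1.0 + falloff * change)


def time_gain_mask(prev_frame, cur_frame, next_frame):
    """Apply time_gain to every pixel of three equally sized frames (lists of rows)."""
    return [[time_gain(p, c, n) for p, c, n in zip(rp, rc, rn)]
            for rp, rc, rn in zip(prev_frame, cur_frame, next_frame)]
```

With only three samples the noise test is crude; looking at two or more frames in
each time direction, as the text suggests, would give a more reliable decision.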

Alternatively, one may want to determine the gain only from past values so that
the system is causal and the embedder causes no delay. This can be accomplished by
using the past values to calculate the gain directly or to estimate the future value and
calculate the gain using this estimate. In one embodiment, the estimate(s) can be
dependent upon the slope and change in slope of the current pixel value and previous
values, and the resulting time-gain can be based upon the variance of the three existing
values and estimated value(s). The predictive frames used in most video compression
schemes, such as MPEG P and B frames, can be used to set the time-gain.
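
One way to read the causal embodiment is as a second-order extrapolation: the next
value is estimated from the current slope plus the change in slope, and the time-gain
is derived from the variance of the three known values and the estimate. The sketch
below assumes that reading; the mapping from variance to gain and the falloff
constant are hypothetical.

```python
from statistics import pvariance


def predict_next(v0, v1, v2):
    """Estimate the next pixel value from three past values (v0 is the oldest)."""
    slope = v2 - v1                  # most recent rate of change
    slope_change = slope - (v1 - v0) # how the rate of change is itself changing
    return v2 + slope + slope_change


def causal_time_gain(v0, v1, v2, falloff=0.01):
    """Time-gain from past values only; falloff is an assumed tuning constant."""
    estimate = predict_next(v0, v1, v2)
    variance = pvariance([v0, v1, v2, estimate])
    return 1.0 / (1.0 + falloff * variance)
```

For a pixel rising steadily through 10, 20, 30 the estimate is 40 and the gain drops
below 1; for a constant pixel the variance is zero and the gain stays at 1.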

Fig. 1 illustrates a diagram of a digital watermark embedder for video using
time based perceptual masking to reduce visibility of the watermark. The inputs to the
embedder include a video stream 100 and an auxiliary data message to be imperceptibly
embedded into the video stream. Conceptually, there are two components of the
embedder: a message pre-processor for transforming the auxiliary data into an
intermediate signal for embedding into the host video stream, and a human
perceptibility system analyzer for computing a mask used to control the embedding of
the intermediate signal into the host video stream.

The message pre-processor transforms the message signal into an intermediate
signal according to a protocol for the desired digital watermark application. This
protocol specifies embedding parameters, like:

    the size of the message as well as number and meaning of data fields in the
    message;
    the symbol alphabet used for the message elements, e.g., binary, M-ary etc.;
    the type of error correction coding applied to the message;
    the type of error detection scheme applied to the message;
    the type and nature of the carrier signal modulated with the message signal;
    the sample resolution, block size, and transform domain of the host signal to
    which elements of the intermediate signal are mapped for embedding; etc.

The example shown in Fig. 1 pre-processes as follows (104). First, it applies
error correction coding to the message, such as turbo, BCH, convolutional, and/or Reed
Solomon coding. Next it adds error detection bits, such as parity bits and/or Cyclic
Redundancy Check (CRC) bits. The message 102 includes fixed bits (e.g., a known
pattern of bits to verify the message and synchronize the reader) and variable bits to
carry variable data, such as frame number, transaction ID, time stamp, owner ID,
content ID, distributor ID, copy control instructions, adult rating, etc.

The embedder modulates the message with a carrier signal, such as a pseudo
random sequence, features of the host video signal 100, or both. The embedder also
maps elements of the intermediate signal to samples in the host video signal (e.g.,
particular samples in the spatial or frequency domain of the video signal). The mapping
function preferably replicates instances of the message throughout the video signal, yet
scrambles the message instances such that they are more difficult to visually perceive
and detect through analysis of the video stream. For more about message processing
for digital watermark embedding, see U.S. Patent Application No. 09/503,881 and US
Patent 6,122,403.
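
The pre-processing chain described above (error correction, error detection, carrier
modulation) can be illustrated with a toy pipeline. The sketch makes several
substitutions that are assumptions for the example: a simple repetition code stands in
for turbo, BCH, convolutional or Reed Solomon coding, CRC-32 is used as the error
detection scheme, and the carrier is a keyed pseudo random sequence of +1/-1 chips.
The mapping and scrambling of the result onto video samples is not shown.

```python
import random
import zlib


def to_bits(data: bytes):
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]


def preprocess(message: bytes, key: int, chips_per_bit: int = 4, repeat: int = 3):
    """Turn a message into an intermediate signal of +1/-1 values."""
    # 1. Error correction (stand-in): repeat every message bit.
    coded = [bit for bit in to_bits(message) for _ in range(repeat)]
    # 2. Error detection: append the 32 CRC bits of the message.
    payload = coded + to_bits(zlib.crc32(message).to_bytes(4, "big"))
    # 3. Carrier modulation: spread each bit over pseudo random chips.
    rng = random.Random(key)
    signal = []
    for bit in payload:
        sign = 1 if bit else -1
        for _ in range(chips_per_bit):
            signal.append(sign * rng.choice((-1, 1)))
    return signal


def recover_bits(signal, key: int, chips_per_bit: int = 4):
    """Correlate with the same keyed carrier to recover the payload bits."""
    rng = random.Random(key)
    bits = []
    for i in range(0, len(signal), chips_per_bit):
        corr = sum(s * rng.choice((-1, 1)) for s in signal[i:i + chips_per_bit])
        bits.append(1 if corr > 0 else 0)
    return bits
```

A reader holding the same key regenerates the carrier and recovers the payload by
correlation; without the key the chips look like noise.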

The human perceptibility analyzer calculates an "intraframe" perceptual mask
(106) based on spatial visual attributes within a frame. This mask provides a vector of
gain values corresponding to locations within the frame and indicating the data hiding
capacity of the image at these locations in the frame. These gain values are a function
of signal activity (e.g., a measure of local variance, entropy, contrast), luminance, and
edge content (as measured by an edge detector or high pass filter) at locations within
the frame. Locations with higher signal activity and more dense edge content have
greater data hiding capacity, and therefore, the signal energy with which the
intermediate signal is embedded can be increased. Similarly, the changes made to the
host signal due to the embedding of the watermark can be increased in these areas.
Further examples of such perceptual masking are provided in U.S. Patent Application
No. 09/503,881 and US Patent 6,122,403.
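
As a rough illustration of an intraframe mask, the sketch below uses only one of the
attributes named above, local variance, as the measure of signal activity. Luminance
and edge content are left out, and the window radius, base gain, scale and cap are
hypothetical constants chosen for the example.

```python
from statistics import pvariance


def intraframe_mask(frame, radius=1, base=0.5, scale=0.05, max_gain=2.0):
    """Gain per pixel from local standard deviation; frame is a list of rows."""
    height, width = len(frame), len(frame[0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighborhood, clipped at the frame borders.
            window = [frame[j][i]
                      for j in range(max(0, y - radius), min(height, y + radius + 1))
                      for i in range(max(0, x - radius), min(width, x + radius + 1))]
            activity = pvariance(window) ** 0.5
            # Busier neighborhoods can hide more watermark energy.
            row.append(min(max_gain, base + scale * activity))
        mask.append(row)
    return mask
```

Flat regions receive the base gain, while pixels next to a strong edge receive a
higher gain, up to the cap.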

The human perceptibility analyzer also calculates a time based perceptual mask
(108) as introduced above. The time based perceptual analyzer determines how pixels
in a local area change over time (e.g., from frame to frame), and adjusts the gain of the
perceptual mask accordingly. If the pixels in the local area change less than a
predetermined threshold, then the gain in the perceptual mask is relatively unchanged.
If the pixels in the local area change in a smoothly varying manner over time, then the
gain in the perceptual mask is reduced to reduce the visibility of the digital watermark.
Finally, if the pixels in the local area change in a highly varying manner, e.g., in a
chaotic or substantially random manner, then the gain in the perceptual mask is
increased to reflect the increased data hiding capacity of that location in the video
stream.

As noted previously, there are a variety of ways to measure the time varying
changes of pixels at a location. One way is to use a statistical measure such as the
mean, variance or standard deviation, and change in variance or standard deviation of
pixel values (e.g., luminance) over time at a location. For example, a variance near 0,
i.e. below a pre-determined threshold, identifies a stationary area, resulting in a
time-gain near or greater than 1. A variance greater than the threshold with minimal change
in variance identifies a smoothly varying location, resulting in a time-gain below 1. A
variance greater than the threshold, but with a large change in variance, identifies a
noisy area, resulting in a time-gain near or greater than 1.
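
The three-way decision just described can be written down directly. In this sketch
the change in variance is taken as the difference between the variance of the later
half and the earlier half of a short history of pixel values; that split, both
thresholds, and the reduced gain of 0.5 are assumptions for illustration.

```python
from statistics import pvariance


def classify_location(values, var_thresh=4.0, dvar_thresh=50.0):
    """Label a pixel history as 'stationary', 'smooth' or 'noisy'."""
    half = len(values) // 2
    variance = pvariance(values)
    # Change in variance between the earlier and later part of the history.
    variance_change = abs(pvariance(values[half:]) - pvariance(values[:half]))
    if variance < var_thresh:
        return "stationary"
    if variance_change < dvar_thresh:
        return "smooth"
    return "noisy"


def time_gain_from_stats(values, reduced_gain=0.5):
    """Time-gain of 1 for stationary or noisy locations, reduced for smooth change."""
    return reduced_gain if classify_location(values) == "smooth" else 1.0
```

Noise whose variance happens to be steady would be labeled smooth by this rule, so a
practical system would probably combine it with another measure, such as the
rate-of-change and motion vector measures described next.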

Another measure is the absolute change of a pixel value at a location, along with
the time-derivative or rate of change of the absolute change in pixel value. A related
measure is to determine how a pixel is changing by measuring absolute value and/or
changes in motion vectors for that location (e.g., pixel or block of pixels). Calculating
motion vectors is well known in the state of the art of video compression. For
compressed video streams, this motion vector data is part of the data stream, and can be
used to determine the gain for embedding the intermediate signal in spatial domain
samples or frequency domain coefficients (e.g., DCT or wavelet coefficients). For
example, a non-near zero (i.e. above the pre-determined threshold) smoothly varying
motion vector identifies a smoothly changing location and results in a reduced
time-gain value. A near zero motion vector or chaotically changing motion vector identifies
a stationary or noisy location, respectively, and both result in a time-gain value near or
above 1.
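
A sketch of the motion vector rule for one block follows. It takes the block's motion
vectors over consecutive frames as (dx, dy) pairs; how those vectors are read from a
compressed stream is outside the example, and the two thresholds and the reduced gain
are hypothetical values.

```python
import math


def motion_time_gain(vectors, still_thresh=0.5, smooth_thresh=2.0, reduced_gain=0.5):
    """Time-gain for a block from its motion vectors over consecutive frames."""
    magnitudes = [math.hypot(dx, dy) for dx, dy in vectors]
    if max(magnitudes) < still_thresh:
        return 1.0                      # near zero motion: stationary block
    # Size of the jump between each pair of consecutive vectors.
    jumps = [math.hypot(b[0] - a[0], b[1] - a[1])
             for a, b in zip(vectors, vectors[1:])]
    if max(jumps, default=0.0) < smooth_thresh:
        return reduced_gain             # steady motion: watermark would show
    return 1.0                          # erratic motion: treated as noisy
```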

Alternatively, the system may use color values or combinations of colors that
are more accurate than luminance to predict perceptibility of the watermark. For
example, psycho-visual research may determine that watermarks are more visible in red
during motion, and the system can be adapted to accommodate this finding.

The optimal value of the time-gain will be determined via human perception 
experiments with actual video. 

After computing the perceptual mask in blocks 106 and 108, the embedder uses
the mask to control embedding of the intermediate signal into the host video stream. In
one implementation, for example, the gain is applied as a scale factor to the
intermediate signal, which in turn, is added to corresponding samples of the video
signal (e.g., either spatial or frequency domain samples). The result is a video stream
with a hidden digital watermark 112.
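
The scale-and-add step can be sketched for spatial domain samples as below. The
example assumes that the intraframe gain and the time-gain are combined by
multiplication and that samples are 8-bit values clipped to 0..255; neither detail is
specified in the text.

```python
def embed_frame(host, watermark, intra_gain, time_gain, lo=0, hi=255):
    """Add the gain-scaled intermediate signal to the host frame, sample by sample.

    All four arguments are equally sized lists of rows.
    """
    out = []
    for h_row, w_row, gi_row, gt_row in zip(host, watermark, intra_gain, time_gain):
        out.append([min(hi, max(lo, round(h + gi * gt * w)))
                    for h, w, gi, gt in zip(h_row, w_row, gi_row, gt_row)])
    return out
```

A location with a time-gain of 0.5 receives half the watermark energy that the
intraframe mask alone would have allowed.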

A further innovation is to apply a time varying dither signal to control the
strength of the digital watermark signal at locations corresponding to pixels or groups
of pixels (e.g., 8 by 8 block of DCT coefficients, group of wavelet subband coefficients,
etc.) in the host video stream. This dither signal is preferably random, such as a pseudo
random signal generated by a pseudorandom number generator (e.g., a cryptographic hash).
It may be implemented by applying it to the intra frame gain or to the time-varying gain
of the digital watermark signal. The dither creates a perturbation of the gain value. For
example, if the gain value is one, the dither creates a fractional perturbation around the
value of one.

In one implementation, the dither for a pixel or group of neighboring pixel
locations in a video stream varies over time and relative to the dither for neighboring
pixel or group locations. In effect, the dither creates another form of time varying gain.
The dither signal improves the visual quality of the digitally watermarked video signal,
particularly in areas where the watermark might otherwise cause artifacts due to the
difference in time varying characteristics of the host video signal relative to the
watermark signal. The dither signal may be used with or without the time varying gain
calculations described in this document. Further, the user should preferably be allowed
to turn the dither on or off as well as vary the gain of the dither in the digital watermark
embedding environment (on a frame, video object, or video scene basis).
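
A dither of this kind might be derived from a cryptographic hash of a key, the frame
index and the block index, so that it varies both over time and between neighboring
blocks. The sketch below assumes SHA-256, a multiplicative perturbation, and a
user-adjustable strength and on/off switch; the function and parameter names are
hypothetical.

```python
import hashlib


def dither(key: bytes, frame_idx: int, block_idx: int) -> float:
    """Pseudo random value in [-1, 1) for one block of one frame."""
    digest = hashlib.sha256(
        key + frame_idx.to_bytes(4, "big") + block_idx.to_bytes(4, "big")
    ).digest()
    unit = int.from_bytes(digest[:4], "big") / 2 ** 32   # uniform in [0, 1)
    return 2.0 * unit - 1.0


def dithered_gain(gain, key, frame_idx, block_idx, strength=0.1, enabled=True):
    """Perturb a gain value by a fraction of itself; strength sets the dither gain."""
    if not enabled:
        return gain
    return gain * (1.0 + strength * dither(key, frame_idx, block_idx))
```

With a gain of one and a strength of 0.1, the dithered gain stays between 0.9 and
1.1 and changes from frame to frame and from block to block.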
Object-based Masking of Video Watermarks 

Another method to provide invisible watermarks for video is object-based
masking: segment the objects and have the watermarks move with each object. The
digital watermark for one or each video object is designed to be invisible spatially
within the object, and since the watermark moves with the object, motion cannot make
the watermark visible.

The segmentation must be accurate to alleviate edge effects. The segmentation
can be performed on the composite video or on each video stream before the final
mixing.

If all objects are embedded, the system should take care to make sure that the
watermarks do not interfere with each other. In one such embodiment, the background
is not watermarked. In another, the objects contain payloads that are all spatially
synchronized with a low-level background calibration signal (for example, subliminal
graticules disclosed in U.S. Patent 6,122,403). This calibration signal is not perceptible
and helps the system synchronize with each object's bit carrying payload.

After one or more objects are watermarked, the video is saved as composite,
such as in MPEG-2, or in an object based method, such as MPEG-4 formatted video.
In other words, the composite video may be created before distribution or at the player.
For MPEG-2, the embedding system can guarantee that payloads for each object do not
interfere with each other. For MPEG-4, each object's watermark payload can be read
before rendering, or can be designed not to interfere with the composite video.

Related Applications 

Digital watermark technology may be used in a variety of applications. One
such application is a method to connect a media signal, such as an audio signal, video
signal or still image to a network resource. This method operates in a computer
network environment. Operating in a network connected device, the method extracts an
identifier from a media signal, such as from a digital watermark, perceptual hash, or
other machine extracted signal identifier. It then sends the identifier to a network along
with context information indicating device type information. From the network, the
method receives related data associated with the media signal via the identifier. The
related data is adapted to the network connected device based on the device type
information. This device type information may include a display type, so that the
related data may be formatted for rendering on the display type of the device. This
device type information may also include a connection speed so that the related data
may be optimized for the connection speed of the device.

Connected content refers to a method of connecting multimedia content, such as
an image, video stream or audio clip, to a network resource, such as a web page or other
program. As described in this document, one way to form connected content is to
include a unique identifier (ID) in the content, and link the content to related data,
possibly using a web page URL, via a central Internet web router and server based upon
the unique ID and secondary database. The ID can be part or all of the message
payload of a digital watermark embedded in the content signal so it is inherently
distributed with the content, or a meta tag, possibly contained in the header or footer
of the file and potentially locked to the file via encryption.

When the user wants to display the related data, possibly by clicking on an icon
that displays that the content is connected, the user may be using one of many types of
devices. Devices can have the following displays: a computer monitor, a TV screen, or
a small screen on a portable Internet appliance or a cell phone. Each device can have a
low- or high-speed (bps) connection to the Internet. Each display has its unique
characteristics; for example, a computer screen can display fine grain detail and text
whereas a TV cannot. An Internet appliance and cell phone have small displays. Each may
have a high or low speed connection. Thus, by sending the context of the situation, such
as the display features and/or Internet connection speed, to the central web server, the
correct type of content can be returned to the display device.

Specifically, the web page may have tags that determine the type of devices to which
each segment of the web page should be sent. The segments could be defined with
XML tags of the format <begin tag> segment data </end tag>. More specifically, a web
page could look like <small display><pc monitor><TV monitor size=+4> html segment
data </small display></TV monitor> html segment data <high speed> html image data
</pc monitor> </high speed>. Thus, with a computer monitor on a high-speed
connection, all of the content will be sent. In contrast, with a cell phone only some of
the text is sent. Or, with a TV screen some of the text is sent and it is reformatted to a
larger font.

Alternatively, the web page may contain several complete but different versions,
divided by display type and connection speed.

Importantly, the display type and connection speed must be communicated via
the web router to the web page server, so the correct context sensitive data can be
returned. These features can also be sent using XML structure, such as <speed> high
speed </speed> <display> PC Monitor </display>.
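
The server-side selection could look like the following sketch. It assumes the context
arrives wrapped in a hypothetical <context> root element so that it parses as
well-formed XML, and it represents the page as a list of segments, each labeled with
the displays and speeds it is meant for, rather than parsing the inline tag format
shown earlier. The segment contents and label strings are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: each segment lists the displays and speeds it should be sent to.
SEGMENTS = [
    {"html": "<p>Headline</p>",
     "displays": {"small display", "pc monitor", "tv monitor"},
     "speeds": {"low speed", "high speed"}},
    {"html": "<p>Details</p>",
     "displays": {"pc monitor", "tv monitor"},
     "speeds": {"low speed", "high speed"}},
    {"html": "<img src='photo.jpg'>",
     "displays": {"pc monitor"},
     "speeds": {"high speed"}},
]


def parse_context(xml_text):
    """Return (display, speed) from the device's context message."""
    root = ET.fromstring(xml_text)
    return (root.findtext("display").strip().lower(),
            root.findtext("speed").strip().lower())


def select_segments(xml_text, segments=SEGMENTS):
    """Keep only the segments that suit the reported display and speed."""
    display, speed = parse_context(xml_text)
    return [s["html"] for s in segments
            if display in s["displays"] and speed in s["speeds"]]
```

A PC monitor on a high-speed connection receives all three segments, while a small
display on a low-speed connection receives only the headline.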
Some features of this method include: 

1. A method of connecting multimedia content to a network resource
comprising:

in a network connected device, extracting an identifier from a media signal;

sending the identifier to a network along with context information indicating
device type information, where the identifier is used to look up related data for the
media signal;

from the network, receiving the related data associated with the media signal via
the identifier, where a format of the related data is adapted to the network connected
device based on the device type information.

2. The method of feature 1 wherein the device type information includes a
display type of the network connected device, and the related data received from the
network is formatted for the display type based on the device type information from the
network connected device.

3. The method of feature 1 wherein the device type information includes a
connection speed of the network connected device, and the related data received from
the network is selected based on the connection speed.

4. A computer readable medium on which is stored instructions for performing
the method of feature 1.

5. The method of feature 1 wherein the identifier comprises a digital watermark
embedded in the media signal.


Time stamped watermark 

Another application of a digital watermark is to control usage of multimedia files
in a file-sharing network as described in PCT Application PCT/US01/22953. By
including the creation or release date of the content in its watermark or embedded data
(defined as a time stamped watermark), the content's usage can be controlled over time.

In file sharing networks, a song or movie with a time stamped watermark can
enter different areas of the file sharing operation dependent upon the current date. The
current date can come from the local clock, which is easy to change, or a central clock
on a secure server, which is difficult to change. In its simplest form, the file is not
allowed to be shared for one month after its release and is allowed to be shared after
that. This allows the record labels to capitalize on different market segments at
different times, just as the movie industry does with VHS and DVD releases occurring a
month or so after the theatre release.

Alternatively, the file could propagate through the file-sharing network over time, 
starting in the premium section, then moving to the basic section, possibly one month 
later, and finally entering the year section, possibly one year later. 
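
The staged release just described reduces to comparing the embedded time stamp with
the current date. In this sketch the section names follow the text, the 30-day and
365-day boundaries approximate "one month" and "one year", and sections are treated as
cumulative; all three choices are assumptions for the example.

```python
from datetime import date

# (minimum age in days, section name); names taken from the text above.
TIERS = [(0, "premium"), (30, "basic"), (365, "year")]


def allowed_sections(release: date, today: date, tiers=TIERS):
    """Sections of the file-sharing system a file may appear in on a given day."""
    age_days = (today - release).days
    return [name for min_age, name in tiers if age_days >= min_age]
```

The value of `today` would ideally come from a central clock on a secure server, since
the local clock is easy to change.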

In reference to digital asset management systems, the time stamped watermark
could be used to find the most recent version of the file. For example, if Ford wanted to
use the most recent image of its F150 truck, it could compare the embedded date of the
current picture to that of the latest entry into its digital asset management system to find
the most recent version.

More file sharing enhancements 

This section describes a number of additional enhancements for the use of
auxiliary data embedded in multimedia files in file sharing systems, including:

1. Using different beginning and ending frame payloads to determine a successful
download of a multimedia file (e.g., an audio or video file), or using a header indicating
the number of frames in the media signal so that the receiver can check whether the
number of received frames matches the number indicated in the header.

2. Streaming a compressed audio or video file from a distributing server to a user's client
computer when the user does not have usage rights for that file to enable the user to
preview the audio or video file. This system only requires server side security to keep
the file from being tampered with, and server side security is easier to implement.

3. Hashing the audio in each frame to two or more bytes and using the hash to modulate
bits of the auxiliary data, because this makes it more difficult to change the audio signal
while maintaining a predetermined relationship between the audio data and the auxiliary
data that has been modulated with a hash of the audio data. This method applies to
auxiliary data for video files as well. It applies to embedding data into the media signal
itself by imperceptibly altering the signal with a digital watermark, as well as embedding
the auxiliary data in a file or frame header/footer.

4. Randomly choosing the frames, or data within frames, used to modify the auxiliary
data, based upon a PN sequence, to make it more difficult to change the host audio or
video signal or the auxiliary data.

5. Branding the label by displaying the label's name and/or logo while searching and/or
downloading the file by determining the content provider from the embedded unique ID
or content owner section.

6. Linking back to the retailer where the music was originally bought via a transaction
watermark or embedded data containing the retailer's ID.

7. Automatically generating the embedded ID using a hash of the CD table of contents
(TOC) and/or track, with the TOC hash possibly matching that of CDDB.
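
Item 3 of the list above can be illustrated with a short sketch. It assumes SHA-256
truncated to the length of the auxiliary data as the frame hash and XOR as the
modulation; the text names neither, so both are stand-ins chosen for the example.

```python
import hashlib


def frame_hash(frame: bytes, nbytes: int) -> bytes:
    """Hash one frame of audio (or video) data down to nbytes bytes."""
    return hashlib.sha256(frame).digest()[:nbytes]


def modulate(aux: bytes, frame: bytes) -> bytes:
    """XOR the auxiliary data with the frame hash; applying it twice undoes it."""
    return bytes(a ^ h for a, h in zip(aux, frame_hash(frame, len(aux))))


def verify(modulated: bytes, frame: bytes, expected_aux: bytes) -> bool:
    """True if the frame still yields the expected auxiliary data."""
    return modulate(modulated, frame) == expected_aux
```

If the frame is altered after embedding, its hash changes and the demodulated
auxiliary data no longer matches, which is what ties the auxiliary data to the
content.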


Some features of a method of using auxiliary data embedded in files within file
sharing systems include:

1. In a file sharing system, a method of controlling use of media files
comprising:

embedding auxiliary data into a media signal file, including a time stamp;

extracting the auxiliary data from the media signal;

reading the time stamp from the extracted auxiliary data to control use of the
media signal file in the file sharing system.

2. The method of feature 1 wherein the auxiliary data is embedded in a digital
watermark in a media signal within the media signal file.

3. A computer readable medium on which is stored instructions for performing
the method of feature 1.

4. The method of feature 1 wherein the extracted data is used to control
rendering of the media signal file in the file sharing system.

5. The method of feature 1 wherein the extracted data is used to control transfer
of the media signal file in the file sharing system.

6. The method of feature 1 wherein the time stamp from the extracted data is
compared with a time of processing, and usage rights are determined based on the
relative time between the time stamp and the time of processing of the media signal file.

7. The method of feature 6 wherein the file is not allowed to be shared within a
period of time as measured by a comparison of the time stamp and the time of
processing.

8. The method of feature 6 wherein the file is allowed to enter an additional
section of the file sharing system as more time elapses between a time indicated in the
time stamp and the time of processing.

9. The method of feature 8 wherein the section corresponds to a level of
subscription in the file sharing system.

10. The method of feature 1 wherein the time stamp is used to find a version of
the media signal file based on the time stamp embedded in the file.

Time Codes in Video and Audio Watermark Payloads 

For a number of applications, it is useful to embed time or sequence codes in
video and audio watermarks. Preferably, these codes are embedded in a sequence of
frames that comprise the video or audio stream of interest. One way to implement the
code is to increment the code for each frame or group of neighboring frames in the time
dimension, starting from the beginning of the video or audio clip, and continuing to the
end of a portion to be marked. Another way is to embed a code indicating the number
of frames between succeeding watermark payloads. These codes enable later
authentication of the video or audio stream by extracting the digital watermark from
each frame or group of frames, and then checking to determine whether the extracted
codes are complete and in the same order as at the time of embedding. Alternatively,
codes indicating the number of frames between embedded watermarks are used to
check the number of received frames relative to the number of frames indicated by the
watermark payload code. These codes enable the receiver to authenticate the stream
and determine which portions, if any, are missing or have been altered.
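
For the incrementing-code variant, the receiver's check amounts to walking the
extracted codes and noting gaps and reversals. The sketch below assumes one integer
code per frame, increasing by one from a known starting value; the function name, the
optional expected frame count and the shape of the result are hypothetical.

```python
def check_time_codes(codes, start=0, count=None):
    """Report missing and out-of-order time codes in an extracted sequence."""
    expected = start
    missing, out_of_order = [], []
    for code in codes:
        if code == expected:
            expected += 1
        elif code > expected:
            # Codes were skipped: the frames in between are missing.
            missing.extend(range(expected, code))
            expected = code + 1
        else:
            # A code lower than expected arrived late or was duplicated.
            out_of_order.append(code)
    if count is not None:
        # Frames cut from the end leave the final codes unaccounted for.
        missing.extend(range(expected, start + count))
    return {"complete": not missing and not out_of_order,
            "missing": missing,
            "out_of_order": out_of_order}
```

A stream whose codes read 0, 1, 3, 4 is reported as missing frame 2, and a stream
that stops early is caught when the expected frame count is supplied.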

Some examples of a method for using time codes embedded in a media signal 
5 file include: 

1 . A method of authenticating a media signal file using auxiliary embedded 
data hidden in the file, the method comprising: 

extracting time codes from the auxiliary data hidden in the file; and 
checking the time codes to determine whether frames in the media signal file are 
10 complete. 

2. The method of feature 1 wherein the media signal file comprises a video file. 

3. The method of feature 1 wherein the media signal file comprises an audio 

15 file. 

4. The method of feature 1 wherein the auxiliary data comprises a hidden 
digital watermark imperceptibly embedded by altering data samples of a video or audio 
signal in the media signal file. 

20 

5. The method of feature 1 wherein the time codes indicate a number of frames 
between selected time frames in the media signal file, and enable verification that the 
number of frames are present in the media signal file. 

6. The method of feature 1 wherein the time codes are embedded in an ordered 
time sequence in frames within the media signal. 

7. The method of feature 6 wherein the time codes are extracted and an order of 
the extracted time codes is analyzed to determine whether the media signal file has been 
tampered with. 

8. A computer readable medium on which is stored instructions for performing 
the method of feature 1. 
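
By way of a non-limiting illustration of feature 5, the frame-count check might be sketched in Python as follows. The function name check_frame_gaps and the (frame index, frame count) representation of a payload are assumptions made for this sketch. It further assumes that the embedded count is the distance, in frames, from one payload-bearing frame to the next.

```python
from typing import List, Tuple


def check_frame_gaps(
    payloads: List[Tuple[int, int]]
) -> List[Tuple[int, int, int]]:
    """Verify the number of received frames between succeeding payloads.

    payloads -- (received_frame_index, frames_to_next_payload) pairs, in
                received order; both fields are assumed to come from the
                watermark detector.

    Returns (frame_index, expected_gap, actual_gap) for every payload whose
    embedded count disagrees with the number of frames actually received.
    """
    mismatches = []
    for (idx, expected), (next_idx, _) in zip(payloads, payloads[1:]):
        actual = next_idx - idx
        if actual != expected:
            mismatches.append((idx, expected, actual))
    return mismatches


if __name__ == "__main__":
    # Five frames were removed between the second and third payloads.
    print(check_frame_gaps([(0, 30), (30, 30), (55, 30)]))
```

In this sketch, an empty result indicates that the expected number of frames is present between each pair of detected payloads.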
Concluding Remarks 

Having described and illustrated the principles of the technology with reference 
to specific implementations, it will be recognized that the technology can be 
implemented in many other, different forms. To provide a comprehensive disclosure 
without unduly lengthening the specification, applicants incorporate by reference the 
patents and patent applications referenced above. 

The methods, processes, and systems described above may be implemented in 
hardware, software or a combination of hardware and software. For example, the 
embedding processes may be implemented in a programmable computer or a special 
purpose digital circuit. Similarly, detecting processes may be implemented in software, 
firmware, hardware, or combinations of software, firmware and hardware. The 
methods and processes described above may be implemented in programs executed 
from a system's memory (a computer readable medium, such as an electronic, optical or 
magnetic storage device). 

The particular combinations of elements and features in the above-detailed 
embodiments are exemplary only; the interchanging and substitution of these teachings 
with other teachings in this and the incorporated-by-reference patents/applications are 
also contemplated. 






I claim: 

1. A method of embedding a digital watermark into a video signal such that the 
digital watermark is substantially imperceptible in the video signal, the method 
comprising: 

computing a time based perceptual mask comprising gain values corresponding 
to locations within a frame, where the gain value for a location in the frame is changed 
as a function of the change in one or more pixel values at the location over time; and 

using the gain values of the time based perceptual mask to control embedding of 
corresponding elements of a digital watermark signal such that the perceptibility of the 
elements of the digital watermark signal is reduced in time varying locations of the 
video signal. 



2. The method of claim 1 wherein the gain is reduced at a location in a frame of 
video where changes in pixel values over time at that location indicate that data hiding 
capacity of the location is reduced. 



3. The method of claim 1 wherein the gain is reduced at a location in a frame of 
video where the change in pixel values over time is highly varying, indicating that the 
data hiding capacity of the location is reduced. 

4. The method of claim 1 including: 

computing a perceptual mask that is a function of the time based mask and a 
function of an intraframe mask calculated as a function of signal activity within a 
frame. 

5. A method of embedding a digital watermark into a video signal such that the 
digital watermark is substantially imperceptible in the video signal, the method 
comprising: 

computing a mostly invisible watermark for one or more objects of the 
video; and 

having the watermark move with the object. 

6. The method of claim 5 including segmenting a composite video into objects. 

7. The method of claim 5 wherein the video is recorded and saved as 
independent objects. 

8. The method of claim 5 including embedding a calibration signal in the 
composite video to synchronize each object's watermark payload. 


